<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  

  
  <title>regression | Chen_Blog</title>
  <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
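  <meta name="description" content="Linear regression: suppose the training set has 1000 examples with 2 input features each. Given randomly generated batch features $\boldsymbol{X} \in \mathbb{R}^{1000\times2}$, we use a true linear regression model with true weights $\boldsymbol{w}=[2,-3.4]^\top$ and bias $b = 4.2$, plus a noise term $\epsilon$, to generate the target variable $\boldsymbo">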
<meta name="keywords" content="linear regression">
<meta property="og:type" content="article">
<meta property="og:title" content="regression">
<meta property="og:url" content="http://yoursite.com/2018/12/05/regression/index.html">
<meta property="og:site_name" content="Chen_Blog">
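<meta property="og:description" content="Linear regression: suppose the training set has 1000 examples with 2 input features each. Given randomly generated batch features $\boldsymbol{X} \in \mathbb{R}^{1000\times2}$, we use a true linear regression model with true weights $\boldsymbol{w}=[2,-3.4]^\top$ and bias $b = 4.2$, plus a noise term $\epsilon$, to generate the target variable $\boldsymbo">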
<meta property="og:locale" content="default">
<meta property="og:updated_time" content="2018-12-13T12:37:03.596Z">
<meta name="twitter:card" content="summary">
<meta name="twitter:title" content="regression">
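<meta name="twitter:description" content="Linear regression: suppose the training set has 1000 examples with 2 input features each. Given randomly generated batch features $\boldsymbol{X} \in \mathbb{R}^{1000\times2}$, we use a true linear regression model with true weights $\boldsymbol{w}=[2,-3.4]^\top$ and bias $b = 4.2$, plus a noise term $\epsilon$, to generate the target variable $\boldsymbo">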
  
    <link rel="alternate" href="/atom.xml" title="Chen_Blog" type="application/atom+xml">
  
  
    <link rel="icon" href="/favicon.png">
  
  
    <link href="//fonts.googleapis.com/css?family=Source+Code+Pro" rel="stylesheet" type="text/css">
  
  <link rel="stylesheet" href="/css/style.css">
</head>

<body>
  <div id="container">
    <div id="wrap">
      <header id="header">
  <div id="banner"></div>
  <div id="header-outer" class="outer">
    <div id="header-title" class="inner">
      <h1 id="logo-wrap">
        <a href="/" id="logo">Chen_Blog</a>
      </h1>
      
        <h2 id="subtitle-wrap">
          <a href="/" id="subtitle">业精于勤，荒于嬉；行成于思，毁于随。</a>
        </h2>
      
    </div>
    <div id="header-inner" class="inner">
      <nav id="main-nav">
        <a id="main-nav-toggle" class="nav-icon"></a>
        
          <a class="main-nav-link" href="/">Home</a>
        
          <a class="main-nav-link" href="/archives">Archives</a>
        
      </nav>
      <nav id="sub-nav">
        
          <a id="nav-rss-link" class="nav-icon" href="/atom.xml" title="RSS Feed"></a>
        
        <a id="nav-search-btn" class="nav-icon" title="Search"></a>
      </nav>
      <div id="search-form-wrap">
        <form action="//google.com/search" method="get" accept-charset="UTF-8" class="search-form"><input type="search" name="q" class="search-form-input" placeholder="Search"><button type="submit" class="search-form-submit">&#xF002;</button><input type="hidden" name="sitesearch" value="http://yoursite.com"></form>
      </div>
    </div>
  </div>
</header>
      <div class="outer">
        <section id="main"><article id="linear-regression" class="article article-type-linear" itemscope itemprop="blogPost">
  <div class="article-meta">
    <a href="/2018/12/05/regression/" class="article-date">
  <time datetime="2018-12-05T14:14:35.000Z" itemprop="datePublished">2018-12-05</time>
</a>
    
  </div>
  <div class="article-inner">
    
    
      <header class="article-header">
        
  
    <h1 class="article-title" itemprop="name">
      regression
    </h1>
  

      </header>
    
    <div class="article-entry" itemprop="articleBody">
      
        <h1 id="线性回归-linear-regression"><a href="#线性回归-linear-regression" class="headerlink" title="Linear regression"></a>Linear regression</h1><p>Suppose the training set has 1000 examples with 2 input features each. Given a randomly generated batch of features $\boldsymbol{X} \in \mathbb{R}^{1000\times2}$, we use a true linear regression model with true weights $\boldsymbol{w}=[2,-3.4]^\top$ and bias $b = 4.2$, plus a noise term $\epsilon$, to generate the target variable $\boldsymbol{Y}$:<br>$$\boldsymbol{Y}=\boldsymbol{X}\boldsymbol{w}+b +\epsilon$$<br>In this example the noise term is normally distributed, $\epsilon\sim N(0,0.001)$. Each example is generated as:<br>$$y[i]=w[0] \cdot X[i][0] + w[1] \cdot X[i][1] + b + \epsilon $$</p>
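<p>The data-generating process described above can be sketched in plain Python. This is only an illustration under stated assumptions: the features are drawn from a standard normal distribution (the text only says they are random), the 0.001 in $N(0,0.001)$ is read as a variance, and the helper name <code>synthetic_data</code> is hypothetical.</p>

```python
import random

def synthetic_data(true_w, true_b, num_examples, noise_std, seed=0):
    """Return (X, y) with y[i] = w[0]*X[i][0] + w[1]*X[i][1] + b + noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(num_examples):
        # Assumption: features are standard normal; the post only says "random".
        features = [rng.gauss(0.0, 1.0) for _ in true_w]
        noise = rng.gauss(0.0, noise_std)
        target = sum(w_j * x_j for w_j, x_j in zip(true_w, features)) + true_b + noise
        X.append(features)
        y.append(target)
    return X, y

true_w, true_b = [2.0, -3.4], 4.2
# Assumption: 0.001 is the noise variance, so the standard deviation is its square root.
X, y = synthetic_data(true_w, true_b, num_examples=1000, noise_std=0.001 ** 0.5)
print(len(X), len(X[0]), len(y))  # prints: 1000 2 1000
```

<p>Fixing the seed keeps the generated dataset reproducible, which makes it easier to compare different fitting methods on the same data.</p>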
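<p>Loss function (squared loss):</p>
<p>$$(w^*, b^*) = \arg\min_{(w,b)}\frac{1}{2}\sum^m_{i=1} (\hat y_i -y_i)^2, \quad \hat y_i = wx_i+b$$<br>The constant factor $\frac{1}{2}$ does not change the minimizer, so for a single input feature we work with $E_{w,b}=\sum^m_{i=1}(y_i-wx_i-b)^2$, whose partial derivatives are:<br>$$\frac{\partial E_{w,b}}{\partial w} = 2 (w \sum^m_{i=1}x^2_i - \sum^m_{i =1}(y_i-b)x_i )$$</p>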
<p>$$\frac{\partial E_{w,b}}{\partial b} = 2 (mb - \sum^m_{i =1}(y_i-wx_i) )$$</p>
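<p>The two partial derivatives can be sanity-checked numerically. The sketch below compares them with central finite differences of $E_{w,b}=\sum_i (y_i-wx_i-b)^2$ on a tiny made-up dataset; the data values, the evaluation point and the function names are all illustrative assumptions.</p>

```python
def squared_error(w, b, xs, ys):
    """E(w, b) = sum_i (y_i - w*x_i - b)^2 for a single input feature."""
    return sum((y - w * x - b) ** 2 for x, y in zip(xs, ys))

def analytic_gradient(w, b, xs, ys):
    """Partial derivatives of E with respect to w and b, as in the formulas above."""
    m = len(xs)
    grad_w = 2 * (w * sum(x * x for x in xs) - sum((y - b) * x for x, y in zip(xs, ys)))
    grad_b = 2 * (m * b - sum(y - w * x for x, y in zip(xs, ys)))
    return grad_w, grad_b

def numeric_gradient(w, b, xs, ys, h=1e-6):
    """Central finite differences; h is an arbitrary small step."""
    grad_w = (squared_error(w + h, b, xs, ys) - squared_error(w - h, b, xs, ys)) / (2 * h)
    grad_b = (squared_error(w, b + h, xs, ys) - squared_error(w, b - h, xs, ys)) / (2 * h)
    return grad_w, grad_b

# Made-up data, roughly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.9, 5.2, 7.1]
analytic = analytic_gradient(0.5, 0.1, xs, ys)
numeric = numeric_gradient(0.5, 0.1, xs, ys)
print(analytic)
print(numeric)
```

<p>Because $E_{w,b}$ is quadratic in $w$ and $b$, the two results should agree up to floating-point rounding.</p>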
<p>Setting both partial derivatives to zero:<br>$$ \frac{\partial E_{w,b}}{\partial w}=0 $$<br>$$ \frac{\partial E_{w,b}}{\partial b}=0 $$<br>1. From $\frac{\partial E_{w,b}}{\partial b}=0$:<br>$$b = \frac{1}{m}\sum^m_{i=1}(y_i - wx_i)$$<br>2. Then use<br>$$ \frac{\partial E_{w,b}}{\partial w}=0 $$<br>3.<br>$$w\sum^m_{i=1}x^2_i-\sum^m_{i =1}(y_ix_i-bx_i)=0$$<br>4.<br>$$w\sum^m_{i=1}x^2_i-\sum^m_{i =1}y_ix_i+\sum^m_{i =1}bx_i=0$$<br>5.<br>$$w\sum^m_{i=1}x^2_i-\sum^m_{i =1}y_ix_i+b\sum^m_{i =1}x_i=0$$<br>6. Substitute $b$ from step 1:<br>$$w\sum^m_{i=1}x^2_i-\sum^m_{i =1}y_ix_i+(\frac{1}{m}\sum^m_{i=1}(y_i - wx_i))\sum^m_{i =1}x_i=0$$<br>7.<br>$$w\sum^m_{i=1}x^2_i-\sum^m_{i =1}y_ix_i+\frac{1}{m}\sum^m_{i=1}y_i\sum^m_{i =1}x_i -\frac{1}{m}\sum^m_{i=1} wx_i\sum^m_{i =1}x_i=0$$<br>8. Collect the terms in $w$, writing $\bar x = \frac{1}{m}\sum^m_{i=1}x_i$:<br>$$w(\sum^m_{i=1}x^2_i-\frac{1}{m}(\sum^m_{i =1}x_i)^2)=\sum^m_{i =1}y_ix_i-\sum^m_{i=1}y_i\bar x$$<br>9.<br>$$w = \frac{\sum^m_{i =1}y_i(x_i-\bar x)}{\sum^m_{i=1}x^2_i-\frac{1}{m}(\sum^m_{i =1}x_i)^2}$$</p>
<p>In summary, with $\bar x$ the mean of the $x_i$, the closed-form solution is:<br>$$w = \frac{\sum^m_{i=1}y_i(x_i-\bar x)}{\sum^m_{i=1}x^2_i-\frac{1}{m}(\sum^m_{i=1}x_i)^2}$$<br>$$b = \frac{1}{m}\sum^m_{i=1}(y_i - wx_i)$$</p>
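<p>As a minimal sketch, the closed-form expressions for $w$ and $b$ translate directly into code for the single-feature case. The function name and the noise-free test line are assumptions chosen for illustration.</p>

```python
def fit_simple_linear_regression(xs, ys):
    """Closed-form least squares for y = w*x + b with one input feature."""
    m = len(xs)
    x_bar = sum(xs) / m
    # w = sum_i y_i (x_i - x_bar) / (sum_i x_i^2 - (sum_i x_i)^2 / m)
    numerator = sum(y * (x - x_bar) for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs) - sum(xs) ** 2 / m
    w = numerator / denominator
    # b = (1/m) sum_i (y_i - w x_i)
    b = sum(y - w * x for x, y in zip(xs, ys)) / m
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
# Noise-free points on y = 2x + 4.2, so the fit should recover w = 2 and b = 4.2
# up to floating-point rounding.
ys = [4.2 + 2.0 * x for x in xs]
w, b = fit_simple_linear_regression(xs, ys)
print(round(w, 6), round(b, 6))
```

<p>The denominator is zero when all $x_i$ are equal, in which case the slope is not identifiable and this sketch would raise a division error.</p>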
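<p>arg is short for argument (the independent variable).<br>arg min is the value of the variable at which the expression that follows attains its minimum.<br>arg max is the value of the variable at which the expression that follows attains its maximum.</p>
<p>For example, for a function F(x,y):</p>
<p>arg min F(x,y) is the pair of values x, y at which F(x,y) attains its minimum.</p>
<p>arg max F(x,y) is the pair of values x, y at which F(x,y) attains its maximum.</p>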
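<p>The difference between the minimum and the arg min can be illustrated with a small brute-force search. The function <code>F</code> below and the search grid are made up for this example.</p>

```python
def F(x, y):
    """A made-up function whose minimum value 3 is attained at (x, y) = (1, -2)."""
    return (x - 1) ** 2 + (y + 2) ** 2 + 3

# Search a coarse grid from -5.0 to 5.0 in steps of 0.1.
grid = [i / 10 for i in range(-50, 51)]
candidates = [(x, y) for x in grid for y in grid]

arg_min = min(candidates, key=lambda point: F(*point))  # where the minimum is attained
min_value = F(*arg_min)                                 # the minimum itself
print(arg_min, min_value)  # prints: (1.0, -2.0) 3.0
```

<p>Here min F(x,y) is the number 3.0, while arg min F(x,y) is the point (1.0, -2.0).</p>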
<p>Taking the gradient and setting it to zero yields the appropriate w and b.</p>
<p>For convenience, we stack $w$ and $b$ into a single vector $\hat w = (w,b)$. Correspondingly, the dataset $D$ is represented as an $m\times (d+1)$ matrix $X$ (the $d$ features plus a constant column of ones), so that $$\hat w^* =\arg\min_{\hat w}(y - X \hat w)^T(y - X \hat w)$$<br>Let $E_{\hat w}=(y - X \hat w)^T(y - X \hat w)$ and differentiate with respect to $\hat w$:<br>$$\frac{\partial E_{\hat w}}{\partial \hat w} =-2X^T (y-X\hat w)$$<br>Setting $\frac{\partial E_{\hat w} }{\partial \hat w}=0$ gives the optimal $\hat w$: $$\hat w^* = (X^TX)^{-1}X^Ty$$<br>With $\hat{x}_i=(x_i,1)$, the multivariate linear regression model is<br>$$f(\hat{x}_i)=\hat{x}_i^T(X^TX)^{-1}X^Ty$$<br>An alternative derivation<br>As before, we estimate $w$ and $b$ by least squares, setting the derivative of the squared error to zero. The computation is as follows:<br>Let $E_{\hat{w}}=(y-X\hat{w})^T(y-X\hat{w})$</p>
<p>   $$
   \begin{aligned}
   E_{\hat{w}}&amp;=(y^T-(X\hat{w})^T)(y-X\hat{w})\\
   &amp;=(y^T-\hat{w}^TX^T)(y-X\hat{w})\\
   &amp;=(\hat{w}^TX^T-y^T)(X\hat{w}-y)\\
   &amp;=\hat{w}^TX^TX\hat{w}-y^TX\hat{w}-\hat{w}^TX^Ty+y^Ty\\
   &amp;=\hat{w}^TX^TX\hat{w}-y^TX\hat{w}-y^TX\hat{w}+y^Ty\\
   &amp;=\hat{w}^TX^TX\hat{w}-2y^TX\hat{w}+y^Ty
   \end{aligned}
   $$<br>   The fifth line uses the fact that $\hat{w}^TX^Ty$ is a scalar and therefore equals its transpose $y^TX\hat{w}$.<br>   Differentiate with respect to $\hat{w}$, using $\frac{\partial{x^TAx}}{\partial{x}}=(A+A^T)x$ and $\frac{\partial{a^Tx}}{\partial{x}}=a$:<br>   $$
   \begin{aligned}
   \frac{\partial{E_{\hat{w}}}}{\partial{\hat{w}}}&amp;=(X^TX+(X^TX)^T)\hat{w}-2X^Ty\\
   &amp;=2X^TX\hat{w}-2X^Ty
   \end{aligned}
   $$<br>   Setting this to zero gives<br>   $$
   \hat{w}^*=(X^TX)^{-1}X^Ty
   $$<br>   where $(X^TX)^{-1}$ is the inverse of $X^TX$, which exists only when $X^TX$ is full-rank (non-singular).<br>   With $\hat{x}_i=(x_i,1)$, the multivariate linear regression model is<br>   $$
   f(\hat{x}_i)=\hat{x}_i^T(X^TX)^{-1}X^Ty
   $$</p>
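<p>The closed-form solution $\hat{w}^*=(X^TX)^{-1}X^Ty$ can be sketched in plain Python. Two choices here are assumptions made for the illustration and not taken from the derivation: instead of forming the inverse explicitly, the code solves the equivalent linear system $X^TX\hat{w}=X^Ty$ by Gaussian elimination, and the test data is noise-free with the weights $[2,-3.4]$ and bias $4.2$ used earlier in this post. All function names are hypothetical.</p>

```python
import random

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    B_cols = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in B_cols] for row in A]

def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [rhs[i]] for i in range(n)]  # augmented matrix [A | rhs]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) == 0.0:
            raise ValueError("X^T X is singular; the closed form does not apply")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [m_r - factor * m_c for m_r, m_c in zip(M[r], M[col])]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z

def fit_normal_equation(features, targets):
    """Return w_hat = (w_1, ..., w_d, b) solving X^T X w_hat = X^T y."""
    X = [list(row) + [1.0] for row in features]  # append the constant column of ones
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(x * t for x, t in zip(col, targets)) for col in Xt]
    return solve(XtX, Xty)

rng = random.Random(0)
features = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
targets = [2.0 * f[0] - 3.4 * f[1] + 4.2 for f in features]  # noise-free targets
w_hat = fit_normal_equation(features, targets)
print([round(v, 4) for v in w_hat])  # expected to be close to [2.0, -3.4, 4.2]
```

<p>Solving the linear system gives the same $\hat{w}^*$ as multiplying by the inverse when $X^TX$ is non-singular, and it avoids computing $(X^TX)^{-1}$ explicitly.</p>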
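<h2 id="对数几率回归"><a href="#对数几率回归" class="headerlink" title="Logistic regression (log-odds regression)"></a>Logistic regression (log-odds regression)</h2><p>The sections above use a linear model for regression. For a binary classification task, the real-valued output $z = \boldsymbol W^Tx+b$ has to be mapped to a 0/1 label, ideally by the "unit-step function":<br>$$ y = \left\{
    \begin{aligned}
    0 , &amp; \quad z&lt;0;\\
    0.5, &amp; \quad z=0;\\
    1, &amp; \quad z&gt;0;
    \end{aligned}
\right.
$$<br>Because the unit-step function is discontinuous, it cannot be differentiated directly, so we look for a continuous, differentiable, monotonic function to replace it:<br>$$ y = \frac{1}{1+e^{-z}}$$<br>This logistic (log-odds) function is a kind of "Sigmoid function". Substituting $z = \boldsymbol W^Tx+b$ and solving for the linear part:<br>$$\begin{aligned}
y &amp;= \frac{1}{1+e^{-(\boldsymbol W^Tx+b)}}\\
y(1+e^{-(\boldsymbol W^Tx+b)}) &amp;= 1 \\
e^{-(\boldsymbol W^Tx+b)} &amp;= \frac{1}{y} -1 = \frac{1-y}{y}\\
-(\boldsymbol W^Tx+b) &amp;= \ln\frac{1-y}{y} = -\ln\frac{y}{1-y} \\
\boldsymbol W^Tx+b &amp;= \ln\frac{y}{1-y}
\end{aligned}
$$<br> If $y$ is viewed as the likelihood that sample $x$ is a positive example, then $1-y$ is the likelihood that it is a negative example. Their ratio $\frac{y}{1-y}$ is called the odds, and its logarithm is the log-odds (logit). Log-odds regression is therefore also called logit regression; despite the name "regression", it solves a classification problem.</p>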
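<p>The relationship derived above, that the log-odds of $y$ equal the linear part $\boldsymbol W^Tx+b$, can be checked with a short sketch. The weights, bias and input below are hypothetical values chosen only for illustration, and the function names are assumptions.</p>

```python
import math

def sigmoid(z):
    """Logistic function y = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(y):
    """Log-odds ln(y / (1 - y)), the inverse of the sigmoid on (0, 1)."""
    return math.log(y / (1.0 - y))

def predict_proba(w, x, b):
    """Likelihood that x is a positive example under the model sigmoid(w^T x + b)."""
    z = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return sigmoid(z)

# Hypothetical parameters and input, for illustration only.
w, b = [2.0, -3.4], 4.2
x = [1.0, 2.0]
z = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b  # linear part, about -0.6

y = predict_proba(w, x, b)
print(y)         # a value strictly between 0 and 1
print(logit(y))  # recovers the linear part z up to rounding
```

<p>A common way to turn the likelihood into a class label is to predict the positive class when $y$ exceeds 0.5, which corresponds to $z$ being positive.</p>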

      
    </div>
    <footer class="article-footer">
      <a data-url="http://yoursite.com/2018/12/05/regression/" data-id="cjptgn1mb000dev154uqvwm0s" class="article-share-link">Share</a>
      
      
  <ul class="article-tag-list"><li class="article-tag-list-item"><a class="article-tag-list-link" href="/tags/linear-regression/">linear regression</a></li></ul>

    </footer>
  </div>
  
    
<nav id="article-nav">
  
    <a href="/2018/12/06/linear-regression-sources-md/" id="article-nav-newer" class="article-nav-link-wrap">
      <strong class="article-nav-caption">Newer</strong>
      <div class="article-nav-title">
        
          linear-regression-sources.md
        
      </div>
    </a>
  
  
    <a href="/2018/08/21/PseudoBaseStation/" id="article-nav-older" class="article-nav-link-wrap">
      <strong class="article-nav-caption">Older</strong>
      <div class="article-nav-title">PseudoBaseStation</div>
    </a>
  
</nav>

  
</article>

</section>
        
          <aside id="sidebar">
  
    

  
    
  <div class="widget-wrap">
    <h3 class="widget-title">Tags</h3>
    <div class="widget">
      <ul class="tag-list"><li class="tag-list-item"><a class="tag-list-link" href="/tags/Machine-Learning/">Machine Learning</a></li><li class="tag-list-item"><a class="tag-list-link" href="/tags/linear-regression/">linear regression</a></li></ul>
    </div>
  </div>


  
    
  <div class="widget-wrap">
    <h3 class="widget-title">Tag Cloud</h3>
    <div class="widget tagcloud">
      <a href="/tags/Machine-Learning/" style="font-size: 20px;">Machine Learning</a> <a href="/tags/linear-regression/" style="font-size: 10px;">linear regression</a>
    </div>
  </div>

  
    
  <div class="widget-wrap">
    <h3 class="widget-title">Archives</h3>
    <div class="widget">
      <ul class="archive-list"><li class="archive-list-item"><a class="archive-list-link" href="/archives/2018/12/">December 2018</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2018/08/">August 2018</a></li><li class="archive-list-item"><a class="archive-list-link" href="/archives/2018/07/">July 2018</a></li></ul>
    </div>
  </div>


  
    
  <div class="widget-wrap">
    <h3 class="widget-title">Recent Posts</h3>
    <div class="widget">
      <ul>
        
          <li>
            <a href="/2018/12/18/SGD/">SGD</a>
          </li>
        
          <li>
            <a href="/2018/12/18/gradient-descent/">gradient_descent</a>
          </li>
        
          <li>
            <a href="/2018/12/06/linear-regression-gluon/">linear-regression-gluon</a>
          </li>
        
          <li>
            <a href="/2018/12/06/linear-regression-sources-md/">linear-regression-sources.md</a>
          </li>
        
          <li>
            <a href="/2018/12/05/regression/">regression</a>
          </li>
        
      </ul>
    </div>
  </div>

  
</aside>
        
      </div>
      <footer id="footer">
  
  <div class="outer">
    <div id="footer-info" class="inner">
      &copy; 2018 alex Chen<br>
      Powered by <a href="http://hexo.io/" target="_blank">Hexo</a>
    </div>
  </div>
</footer>
    </div>
    <nav id="mobile-nav">
  
    <a href="/" class="mobile-nav-link">Home</a>
  
    <a href="/archives" class="mobile-nav-link">Archives</a>
  
</nav>
    

<script src="//ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>


  <link rel="stylesheet" href="/fancybox/jquery.fancybox.css">
  <script src="/fancybox/jquery.fancybox.pack.js"></script>


<script src="/js/script.js"></script>



  </div>
</body>
</html>